Add runner to wait until snapshot has been created #1047

Merged
dliappis merged 4 commits into elastic:master from the add-wait-for-snapshot-create branch on Aug 14, 2020

Conversation

dliappis
Contributor

For snapshot creation (and its metrics) we have so far relied only on the
`create-snapshot` runner with a blocking call.

This commit introduces a new runner, `wait-for-snapshot-create`, to
complement `create-snapshot`. This is similar to `restore-snapshot` and
`wait-for-recovery` and helps in cases where network connections may be
terminated, making a blocking call unsuitable.
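
The polling approach described above can be sketched roughly as follows. This is a minimal illustration, not the runner added in this PR: `FakeSnapshotClient`, `SnapshotFailedError`, and the function and parameter names are hypothetical, and the response shape is assumed to follow the Elasticsearch snapshot status API (a `snapshots` list whose entries carry a `state`).

```python
import asyncio
import json


class SnapshotFailedError(Exception):
    """Hypothetical stand-in for Rally's exceptions.RallyAssertionError."""


class FakeSnapshotClient:
    """Hypothetical stub that replays canned snapshot states instead of calling Elasticsearch."""

    def __init__(self, states):
        self._states = list(states)
        self.calls = 0

    async def status(self, repository, snapshot):
        self.calls += 1
        # Replay the next state; keep returning the last one once the list is exhausted.
        state = self._states.pop(0) if len(self._states) > 1 else self._states[0]
        return {"snapshots": [{"snapshot": snapshot, "state": state}]}


async def wait_for_snapshot_create(client, repository, snapshot, recheck_wait_period=1.0):
    """Poll the snapshot status until the snapshot has either succeeded or failed."""
    while True:
        response = await client.status(repository=repository, snapshot=snapshot)
        state = response["snapshots"][0]["state"]
        if state == "FAILED":
            raise SnapshotFailedError(
                f"Snapshot [{snapshot}] failed. Response:\n{json.dumps(response, indent=2)}"
            )
        if state == "SUCCESS":
            return response
        # Every poll is a short request, so no long-lived connection has to stay open.
        await asyncio.sleep(recheck_wait_period)


if __name__ == "__main__":
    client = FakeSnapshotClient(["STARTED", "STARTED", "SUCCESS"])
    result = asyncio.run(
        wait_for_snapshot_create(client, "my-repository", "my-snapshot", recheck_wait_period=0.01)
    )
    print(result["snapshots"][0]["state"], "after", client.calls, "status calls")
```

Because each status check is an independent short request, a proxy that drops idle connections only affects a single poll rather than the whole snapshot operation.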

@dliappis dliappis added enhancement Improves the status quo :Track Management New operations, changes in the track format, track download changes and the like labels Aug 13, 2020
@dliappis dliappis added this to the 2.0.2 milestone Aug 13, 2020
@dliappis dliappis self-assigned this Aug 13, 2020
@dliappis dliappis force-pushed the add-wait-for-snapshot-create branch from 065beaf to 293922e Compare August 13, 2020 18:17
Member

@danielmitterdorfer danielmitterdorfer left a comment

Looks good! I left a few comments.

docs/track.rst Outdated
@@ -1119,7 +1119,24 @@ With the operation ``create-snapshot`` you can `create a snapshot <https://www.e
* ``request-params`` (optional): A structure containing HTTP request parameters.

.. note::
    When ``wait-for-completion`` is set to ``true`` Rally will report the achieved throughput in byte/s.
    It's not recommend to rely on ``wait-for-completion=true``. Instead you should keep the default value (``False``) and use an additional ``wait-for-snapshot-create`` operation in the next step.
Member

nit: "not recommend" -> "not recommended"

Contributor Author

Addressed in 32ee4ab

docs/track.rst Outdated
@@ -1119,7 +1119,24 @@ With the operation ``create-snapshot`` you can `create a snapshot <https://www.e
* ``request-params`` (optional): A structure containing HTTP request parameters.

.. note::
    When ``wait-for-completion`` is set to ``true`` Rally will report the achieved throughput in byte/s.
    It's not recommend to rely on ``wait-for-completion=true``. Instead you should keep the default value (``False``) and use an additional ``wait-for-snapshot-create`` operation in the next step.
    This is mandatory on the `Elastic Cloud <https://www.elastic.co/cloud>`_ or environments where Elasticsearch is sitting behind a network element that may terminate the blocking connection after a timeout.
Member

  • on the Elastic Cloud -> on Elastic Cloud?
  • "sitting behind a network element" -> "connected via intermediate network components, such as proxies, that may terminate ..."?
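
For illustration, the two-step pattern recommended in the note under review might look like the following pair of track operations, written here as a Python structure that is serialized to JSON. Only `snapshot`, `wait-for-completion` and `completion-recheck-wait-period` appear in the documentation excerpts in this PR; the `repository` parameter and all values are assumptions, and other parameters the operations may require (such as a request body) are omitted.

```python
import json

# Hypothetical repository and snapshot names, used only for illustration.
schedule = [
    {
        "operation": {
            "operation-type": "create-snapshot",
            "repository": "my-repository",  # assumed parameter
            "snapshot": "my-snapshot",
            # Keep the default: return immediately instead of blocking on the request.
            "wait-for-completion": False,
        }
    },
    {
        "operation": {
            "operation-type": "wait-for-snapshot-create",
            "repository": "my-repository",  # assumed parameter
            "snapshot": "my-snapshot",
            # Seconds to wait between consecutive status checks.
            "completion-recheck-wait-period": 1,
        }
    },
]

print(json.dumps(schedule, indent=2))
```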

Contributor Author

Addressed in 32ee4ab

# Possible states:
# https://www.elastic.co/guide/en/elasticsearch/reference/current/get-snapshot-status-api.html#get-snapshot-status-api-response-body
if response_state == "FAILED":
    self.logger.error("Snapshot [%s] failed. Response status:\n%s", snapshot, json.dumps(response))
Member

  • "Response status" -> "Response"? (as it is the actual, full response). I also think we should pretty-print it by specifying e.g. indent=2 in json.dumps.
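
A small sketch of what this suggestion amounts to, assuming a heavily trimmed, made-up status response; the logger name and the `format_response` helper are hypothetical and only serve the example.

```python
import json
import logging

logging.basicConfig(level=logging.ERROR)
logger = logging.getLogger("snapshot-example")  # hypothetical logger name

# Hypothetical, heavily trimmed snapshot status response.
response = {"snapshots": [{"snapshot": "my-snapshot", "state": "FAILED"}]}


def format_response(response):
    # indent=2 spreads nested structures over multiple lines, which is easier to read in logs.
    return json.dumps(response, indent=2)


logger.error("Snapshot [%s] failed. Response:\n%s", "my-snapshot", format_response(response))
```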

Contributor Author

Addressed in 32ee4ab

# https://www.elastic.co/guide/en/elasticsearch/reference/current/get-snapshot-status-api.html#get-snapshot-status-api-response-body
if response_state == "FAILED":
    self.logger.error("Snapshot [%s] failed. Response status:\n%s", snapshot, json.dumps(response))
    raise exceptions.RallyAssertionError(
Member

Nit: Any reason why it is formatted this way? I tried putting everything on the same line and ended up with a line width of 110 which should still be fine?

Contributor Author

Addressed in 32ee4ab

@@ -420,7 +420,7 @@ class OperationType(Enum):
    RawRequest = 5
    WaitForRecovery = 6
    CreateSnapshot = 7
Member

I think we could move CreateSnapshot again to administrative actions (i.e. anything > 1000) because Rally would not report request metrics by default for administrative operations (it's usually not interesting to know).
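
The convention mentioned here (values above 1000 mark administrative operations) could be sketched like this. The value given to `CreateSnapshot`, the `WaitForSnapshotCreate` member, and the `admin_op` property name are illustrative assumptions, not taken from the diff.

```python
from enum import Enum


class OperationType(Enum):
    # Regular operations: request metrics are reported by default.
    RawRequest = 5
    WaitForRecovery = 6
    WaitForSnapshotCreate = 7  # assumed member and value, for illustration only
    # Administrative operations: any value above 1000.
    CreateSnapshot = 1001  # hypothetical value

    @property
    def admin_op(self):
        # Hypothetical helper: an operation is administrative if its value exceeds 1000.
        return self.value > 1000


if __name__ == "__main__":
    for op in OperationType:
        print(op.name, "administrative" if op.admin_op else "regular")
```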

Contributor Author

Addressed in 32ee4ab

@@ -2756,6 +2757,140 @@ async def test_create_snapshot_wait_for_completion(self, es):
}
})

params = {
Member

Unfortunately I cannot comment at the specific line but in test_create_snapshot_no_wait we still mock es.snapshot.status and assert later on that it is not called. IMHO we should remove this now because there is no chance we'd ever call it anymore in the new runner implementation.

Contributor Author

Addressed in 32ee4ab + 2b242b2 + 6523ec9


r = runner.WaitForSnapshotCreate()

logger = logging.getLogger("esrally.driver.runner")
Member

How about we just use the runner's logger with logger = r.logger (you can then probably just inline it in the with statement below)? This would also make it more refactoring-safe.
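
A minimal sketch of the suggested pattern, using `unittest.mock`. The `WaitForSnapshotCreate` class below is a hypothetical stand-in with a made-up `report_failure` method, not the real runner; the point is only that the test patches the logger instance held by the runner rather than looking a logger up by a hard-coded name.

```python
import logging
from unittest import mock


class WaitForSnapshotCreate:
    """Hypothetical, minimal stand-in for the runner; only its logger matters here."""

    def __init__(self):
        self.logger = logging.getLogger(__name__)

    def report_failure(self, snapshot):
        # Made-up method that logs an error, just to have something to assert on.
        self.logger.error("Snapshot [%s] failed.", snapshot)


def check_failure_is_logged():
    r = WaitForSnapshotCreate()
    # Patch the logger the runner actually uses, inlined in the with statement.
    with mock.patch.object(r.logger, "error") as mocked_error:
        r.report_failure("my-snapshot")
    mocked_error.assert_called_once_with("Snapshot [%s] failed.", "my-snapshot")
    return mocked_error.call_count


if __name__ == "__main__":
    print("error calls recorded:", check_failure_is_logged())
```

If the runner's module or logger name changes later, a test written this way keeps working because it never refers to the name.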

Contributor Author

Addressed in 32ee4ab

docs/track.rst Outdated
* ``snapshot`` (mandatory): The name of the snapshot that this operation will wait until it succeeds.
* ``completion-recheck-wait-period`` (optional, defaults to 1 second): Time in seconds to wait in between consecutive attempts.

Rally will report the achieved throughput in byte/s, the duration in seconds, the start and stop time in milliseconds and the total amount of files snapshotted as returned by the the `Elasticsearch snapshot status API call <https://www.elastic.co/guide/en/elasticsearch/reference/current/get-snapshot-status-api.html>`_.
Member

I am not sure I'd document all the metadata that we return (except for the throughput)? Otherwise we should probably also document the respective metric keys and make clear that these metadata are only available with an Elasticsearch metrics store (contrary to the throughput).

Contributor Author

Addressed in 32ee4ab

@dliappis
Contributor Author

Thanks for your comments! Could you PTAL?

Member

@danielmitterdorfer danielmitterdorfer left a comment

Thanks for iterating. LGTM!

@dliappis dliappis merged commit be8622a into elastic:master Aug 14, 2020
@dliappis dliappis deleted the add-wait-for-snapshot-create branch August 14, 2020 13:37